Subject: Chess History on the Web (2001 no.6)
Date: 15 Mar 2001 14:53:32 -0000
From: "World Chess Championship"

Site review - Player collections

This site review is a continuation of Chess History on the Web (2001 no.5), where we started to look at collections of games on the Web devoted to specific players. I identified ten sites which offer player collections:-

(A) Chessaround
    http://members.aol.com/chessaround/chess/partiendown.html
    ChessBase format

(C) Chesscorner
    http://www.chesscorner.com/games/download/download.htm
    PGN format

(F) Fernschach
    http://www.fernschach-international.de/english/frame-download-engl.htm
    PGN format

(G) GMchess
    http://www.gmchess.com/digest/gamebase/
    PGN format

(N) NBCI Chessdata
    http://members.nbci.com/chessdata/players.htm
    PGN format

(O) Ossimitz
    http://www.crosswinds.net/~ossimitz/player.htm
    'Annofritzed' collections in ChessBase format

(P) Chesspawn
    http://www.chesspawn.de/weltmeister.htm
    ChessBase & PGN formats

(U) UPITT
    http://www.pitt.edu/~schach/
    PGN format

(*) ChessBase
    http://www.chessbase-online.com/playerdatabase.htm

(*) Homestead Observer
    http://www.homestead.com/Observer/titled.html

The letter preceding the name of the site is a code that I'm going to use to identify the site for the rest of this article. When you see '(P)', it means the Chesspawn site.

Since my last article was about Karpov, I was pleased to find that all these sites offered a collection of Karpov games. To prepare that article I downloaded the collection from each site, converted the ChessBase files to PGN, extracted the data from all PGN headers, and loaded the header data into a database for further analysis.

For the two sites marked '(*)', I had trouble downloading their data, so I was forced to abandon them. The trouble may have been a temporary glitch or it may have been lack of some technical knowledge on my part. Whatever the problem, I didn't try again while preparing this current article.
Instead, I added two other collections to my list:-

(M) ChessBase Megabase (extract)
    sent to me by a correspondent

(X) En Passant - Xadrez
    http://www6.ewebcity.com/hohx/partidas.html

(M) is a selection of Karpov games extracted from ChessBase Megabase. I'm going to include it here because it is probably representative of what ChessBase offers on the Web. Of course, if you have the Megabase you don't need to get the games from the Web; if you don't have it, you're not going to recreate it easily from the ChessBase site. I converted the ChessBase games to PGN and added the headers into my database along with the others.

(X) is a Spanish language site. I downloaded its Karpov collection, which is in PGN format, and loaded the headers into my Karpov database.

Let's forget about Karpov for a few moments and look at what each site has to offer. Many of them also have collections covering events, but I won't be examining those here.

(A) is the smallest of the sites. It has collections for Karpov, Kasparov, Kramnik, and Tal.

(C) has collections for the first thirteen world champions, Wilhelm Steinitz through Garry Kasparov.

(F) has collections for twelve world champions (Max Euwe is missing), plus a two part collection for Khalifman. It also has collections for thirteen correspondence world champions, Purdy through Umansky. Some of these collections are small; Sanakoev, the 12th correspondence world champion, has only 56 games in his collection. In addition to the world champions, there are many collections for grandmasters.

(G) segments its collections into 'Grandmasters' and 'Great Masters of the Past'. All collections are marked with the total games and date of last update.

(N) highlights a Kasparov collection along with a number of collections for contemporary players, most of whom are still active.

(O) has a mixture of collections for historical and contemporary players.
The first time I accessed this site, I only received a partial page, listing about half of its player collections. I later discovered this offline when I noticed that I had no record of a Kasparov collection. I revisited the site and received a full page, but on other occasions I again received a partial page. Every time you access a page on this site, a popup window appears, and the partial pages seem to be caused by closing the popup too quickly; I've never noticed this behavior before. Another peculiarity is that the Alekhine link at the top of the page goes to a different site although there is a collection on (O).

(P) has collections for the first thirteen world champions plus Khalifman, with a biography of each player.

(U) has more player collections than any of the other sites covered here. I described its content in some detail for Chess History on the Web (2000 no.24), where I calculated that there were 275 different players covered by UPITT collections.

(X) has collections for a few of the most popular players, both historical and contemporary.

Here is a table which shows the number of collections available from each source.

    Ct  Src
     4  (A)
    13  (C)
    63  (F)
    64  (G)
    23  (N)
    45  (O)
    14  (P)
    14  (X)

The 240 player collections cover 133 different players. Here is another table which shows how many sites have collections for the most popular players.

    Ct  Player
     8  Karpov
     8  Kasparov
     7  Alekhine
     7  Fischer
     7  Tal
     6  Capablanca
     6  Petrosian
     6  Spassky
     6  Steinitz
     5  Botvinnik
     5  Lasker
     5  Smyslov

Twelve players are covered by five or more sites. Fischer is missing from (A). Alekhine is missing from (A) and (N), but has two collections in (O), three if you count the external link. Tal is missing from (X). Luckily for me, Karpov is one of two players covered by all sites.

---

(O) differentiates itself from the other sites in at least two ways. The first unique feature is, as the description of the Karpov collection says, 'Includes a player dossier!'.
The Karpov dossier is a document with a recent picture of Karpov, a graph of his rating from 1976 through 1999, a graph showing the number of games in the collection by year, a table of results against opponents with whom he has played at least 20 games, a summary of his opening repertoire, eight positions from 'spectacular games' (two of which are the same position), and eight checkmates delivered by Karpov.

The second unique feature is that the game collections are 'annofritzed'. The site's home page informs us that annofritzed means 'annotated with Chess-Supersoftware Fritz'. To gauge the value of an annofritzed game, I took a look at one of Karpov's most famous early games. This game is included in all collections of Karpov's best games that I've seen...

    [Event "Alekhine memorial"]
    [Site "Moscow"]
    [Date "1971.??.??"]
    [Round "11"]
    [White "Karpov Anatoli"]
    [Black "Hort Vlastimil"]
    [Result "1-0"]

    1. e4 c5 2. Nf3 d6 3. d4 cxd4 4. Nxd4 Nf6 5. Nc3 e6 6. g4 Nc6
    7. g5 Nd7 8. Be3 a6 9. f4 Be7 10. Rg1 Nxd4 11. Qxd4 e5 12. Qd2 exf4
    13. Bxf4 Ne5 14. Be2 Be6 15. Nd5 Bxd5 16. exd5 Ng6 17. Be3 h6
    18. gxh6 Bh4+ 19. Kd1 gxh6 20. Bxh6 Bf6 21. c3 Be5 22. Rg4 Qf6
    23. h4 Qf5 24. Rb4 Bf6 25. h5 Ne7 26. Rf4 Qe5 27. Rf3 Nxd5
    28. Rd3 Rxh6 29. Rxd5 Qe4 30. Rd3 Qh1+ 31. Kc2 Qxa1 32. Qxh6 Be5
    33. Qg5 1-0

...It's a lovely game, typical of Karpov's classical style. Karpov assigns '!?' to Hort's 17th move, and '!' to his own 22nd, 23rd, 24th, & 30th moves. After his 27th move Karpov writes, 'The rook, at times such an awkward piece, in the given position displays miraculous powers of manoeuvre. It creates one threat after another, and operates efficiently not only in attack, but also in defence'. The comment would make an appropriate introduction to the entire game.

The annofritzed version says nothing about Karpov's moves 22, 23, and 24. After Hort's 17th move it says 'White is better' and gives the alternative 17... O-O!? 'White is slightly better' as an improvement.
Castling into an attack is, however, not everyone's cup of tea and is not easily subject to the kind of concrete analysis where computers excel. For the 30th move the annofritzed version gives 30. Rxd6 Bg5 31. Rd4 Qe7 'White stands slightly better', overlooking 30... Bxc3 31. Qxh6 Be5, which is a more critical line. Nowhere is it mentioned that Hort lost on time.

I am sure that many players derive great benefit from annofritzed games. Computers are good at analysis of tactics, but I fear that they lack the positional understanding which separates great players like Karpov from the rest of us.

The Karpov dossier says that Karpov has played 255 games against Kasparov, scoring 121 points for a lifetime record of -13 against his long running nemesis. Unfortunately, the (O) collection is one which suffers from excess duplicate games, so the statistics aren't accurate. My own collection of Kasparov games, which does not include Linares 2001, but which should otherwise be complete, contains 172 KK games. The results of the 89 games where Karpov had White are:-

    16  1-0
     5  0-1
    68  1/2

And of the 83 games where Karpov had Black are:-

    24  1-0
     5  0-1
    54  1/2

This gives Kasparov a lifetime advantage of +8. Kasparov scored +1-0=1 against Karpov in the recent Linares event, increasing his advantage to +9.

Where do the 83 extra games (255 - 172) come from? The following table gives the distribution of KK games by year for both (O) and for my own Kasparov tournament, match, and exhibition (TME) record.

           (O)  TME
    Year    Ct   Ct
    1975     1    1
    1981     3    3
    1984    67   36
    1985    43   36
    1986    41   24
    1987    37   27
    1988     7    6
    1989     2    1
    1990    35   24
    1991     9    5
    1992     2    1
    1993     1    1
    1994     1    1
    1996     2    2
    1999     4    4

In 1984 & 1985, Karpov and Kasparov played exactly 72 games -- 48 in KKI and 24 KKII. The last 12 games of KKI were played in 1985. The 110 games in (O) for 1984-85 include 38 duplicates! In every year from 1984 through 1992, (O) contains duplicates. Even worse, the word 'duplicate' may not be the most accurate term.
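The totals quoted above can be cross-checked with a few lines of Python. This is only a sketch: the figures are copied from the tables in this article, and the variable names are my own, chosen for illustration.

```python
# Karpov - Kasparov games by year: (count in (O), count in the TME record).
# Figures copied from the table in this article.
KK_BY_YEAR = {
    1975: (1, 1), 1981: (3, 3), 1984: (67, 36), 1985: (43, 36),
    1986: (41, 24), 1987: (37, 27), 1988: (7, 6), 1989: (2, 1),
    1990: (35, 24), 1991: (9, 5), 1992: (2, 1), 1993: (1, 1),
    1994: (1, 1), 1996: (2, 2), 1999: (4, 4),
}

total_o = sum(o for o, _ in KK_BY_YEAR.values())
total_tme = sum(t for _, t in KK_BY_YEAR.values())

# Years in which (O) holds more games than the TME record says were played.
surplus = {year: o - t for year, (o, t) in KK_BY_YEAR.items() if o > t}

# Decisive games from the TME record.
karpov_wins = 16 + 5     # 1-0 with White, 0-1 with Black
kasparov_wins = 5 + 24   # 0-1 with Black, 1-0 with White
kasparov_margin = kasparov_wins - karpov_wins

print("games in (O):", total_o)
print("games in TME:", total_tme)
print("extra games in (O):", sum(surplus.values()))
print("years with extra games:", sorted(surplus))
print("Kasparov's plus score before Linares 2001:", kasparov_margin)
```

Keeping the counts in one dictionary means a corrected figure for any year flows through to every total.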
I found the following game in the UPITT collection...

    [Event "?"]
    [Site "24, Wch 1985 Moscow"]
    [Date "1985.??.??"]
    [Round "?"]
    [White "Karpov Anatoli"]
    [Black "Kasparov Garry"]
    [Result "0-1"]
    [ECO "B54"]

    1. e4 c5 2. Nf3 d6 3. d4 cxd4 4. Nxd4 d5 5. Nc3 a6 6. Be2 e6
    7. O-O Be7 8. f4 Kd7 9. Kh1 Qb6 10. a4 Qa5 11. h4 0-1

...Its header identifies it as the last game of KKII, but it's bogus. In fact, it's a horrible game, unworthy of two lower level club players, not to speak of the deciding game of a world championship match. I suspect that it was created either as a joke or by someone trying to personalize a collection. I found the same game in both the (F) and (O) collections, annofritzed in the latter. Its result, which is accurate, is included in the statistical section of the player dossier. Unfortunately, it's not the only low quality Karpov game I found in UPITT. It's one of about half a dozen.

---

We clearly need good player collections to generate meaningful statistics about players' careers. The collections should be comprehensive and culled of duplicates. In other words, the primary objective for any player collection should be a maximum number of games with a minimum number of duplicates. Secondary objectives are accurate game scores, consistent use of headers, and consistent spelling of opponents' names.

Let's classify the different Karpov collections by number of games vs. number of duplicates. There are four possibilities...

    (1) small collection, many duplicates
    (2) large collection, many duplicates
    (3) small collection, few duplicates
    (4) large collection, few duplicates

...where (1) is the least desirable & (4) is the most desirable collection.
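For readers who want to measure duplicates in a PGN file themselves, here is a minimal sketch in Python. It is not the method used for this review, which worked from header data loaded into a database. The function names (split_games, move_key, count_duplicates) and the sample games are hypothetical, and the sketch assumes plain PGN text in which each game starts with its tag lines.

```python
import re
from collections import Counter

# Matches one PGN tag pair such as: [White "Karpov Anatoli"]
TAG_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$')


def split_games(pgn_text):
    """Yield (headers, movetext) for each game in a PGN string."""
    headers, moves = {}, []
    for raw in pgn_text.splitlines():
        line = raw.strip()
        tag = TAG_RE.match(line)
        if tag:
            if moves:  # a tag line after movetext starts the next game
                yield headers, " ".join(moves)
                headers, moves = {}, []
            headers[tag.group(1)] = tag.group(2)
        elif line:
            moves.append(line)
    if headers or moves:
        yield headers, " ".join(moves)


def move_key(movetext):
    """Reduce movetext to bare moves so that annotated copies compare equal."""
    text = re.sub(r"\{[^}]*\}", " ", movetext)             # {comments}
    while True:                                            # (variations), nested too
        reduced = re.sub(r"\([^()]*\)", " ", text)
        if reduced == text:
            break
        text = reduced
    text = re.sub(r"(1-0|0-1|1/2-1/2|\*)\s*$", " ", text)  # trailing result
    text = re.sub(r"\$\d+", " ", text)                     # numeric annotation glyphs
    text = re.sub(r"\d+\.+", " ", text)                    # move numbers: "12." or "12..."
    text = re.sub(r"[!?]+", "", text)                      # !, ?, !? suffixes
    return " ".join(text.split())


def count_duplicates(pgn_text):
    """Return (total games, games whose moves repeat an earlier game)."""
    keys = Counter(move_key(moves) for _, moves in split_games(pgn_text))
    total = sum(keys.values())
    return total, total - len(keys)


# Made-up sample: the second game is an annotated copy of the first.
SAMPLE = """[Event "Game A"]
[White "Karpov Anatoli"]
[Result "1-0"]

1. e4 c5 2. Nf3 d6 1-0

[Event "Game A, annotated copy"]
[White "Karpov, A"]
[Result "1-0"]

1. e4 {best by test} c5 2. Nf3! d6 1-0

[Event "Game B"]
[White "Karpov Anatoli"]
[Result "1/2-1/2"]

1. d4 d5 1/2-1/2
"""

print(count_duplicates(SAMPLE))
```

Comparing bare moves catches copies whose headers or annotations differ. It will not catch copies whose game scores differ through transcription errors, and it will also merge genuinely separate games that happen to be move-for-move identical, such as short draws.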
The following table shows some statistics on the ten collections (number of games, years of earliest & latest games):-

          Count  First  Last
    (A)    2694   1960  2000
    (C)    1051   1961  1994
    (F)    3381   1961  1999
    (G)    2365   1966  1999
    (M)    2692   1961  2000
    (N)    1588   1963  1995
    (O)    3193   1960  2000
    (U)    3079   1960  1998
    (X)    2383   1966  1999

Where's (P)? Collections (F) and (P) turn out to be identical, so I deleted (P) from my database.

I derived some baseline numbers for 1971 by looking at my own Karpov collection. I had 96 games on my file out of 103 known to have been played by Karpov in nine events in 1971. The following table shows the number of games for 1971 in each collection, the number of games on file for Moscow (the Alekhine Memorial), and the number of games for Hastings. Karpov played exactly 17 games at Moscow and 15 at Hastings.

           1971  Moscow  Hastings
    (A)      90      17        15
    (C)      96      16        15
    (F)     116      21        18
    (G)      53      17        15
    (M)      72      17        15
    (N)      54      17        15
    (O)     101      20        17
    (U)     113      24        17
    (X)      53      17        15

Closer inspection of (X) told me that it was almost the same as the (G) collection, varying only in the last two years of the collection.

Looking at the last two tables taken together, I classified the nine collections:-

    (1) small collection, many duplicates; [none]
    (2) large collection, many duplicates; (F), (O), (U)
    (3) small collection, few duplicates; (C), (N)
    (4) large collection, few duplicates; (A), (G), (M), (X)

Is this classification fair? I decided to do the same exercise for a more recent year. I looked at the average number of games played per year by Karpov across all collections and determined that 1988 was his most active year. There are 1300 games for that year in the nine collections, an average of 144 games for 1988.

This was not a big surprise. 1988 was the first year since 1983 which did not feature a Kasparov - Karpov world championship match. For the first time in four years, the world's two best players were available for any and all events.
Don't forget that their relative playing strengths were not the same as today; they were much closer. Karpov came within a whisker of recapturing the world champion title from Kasparov in 1987 Seville. In 1988, they shared 1-2 places in the 55th USSR Championship, Moscow; played boards 1 & 2 for the USSR at the 28th Olympiad, Thessalonika, where their team finished 1st, six points ahead of England and the Netherlands; and dominated the first three World Cup tournaments held in Brussels, Belfort, and Reykjavik.

My own TME has 150 games played by Karpov in 1988, 15 of them in Belfort, and 16 in Brussels. How do the other Karpov collections compare? The following table shows the number of games played in 1988, in Belfort, and in Brussels.

           1988  Belfort  Brussels
    (A)     158       15        16
    (C)      59        8         8
    (F)     193       19        21
    (G)     147       15        16
    (M)     147       15        16
    (N)      85       15        16
    (O)     182       16        20
    (U)     182       21        18
    (X)     147       15        16

In my opinion, the numbers for (A), (G), (M), and (X) confirm their classification as large collections with few duplicates; my classification of the other collections seems to hold as well. Combine this with the number of collections available from each site...

    Ct  Src
     4  (A)
    64  (G)
    14  (X)

...and it looks like (G) is, all things considered, the best site on the Web for player collections.

A deeper analysis might try to identify sites which have similar or identical collections -- 'who's borrowing from whom' vs. 'who's creating original material'.

This has been a close look at just one player collection from all sites. Had I selected a different player, my results might have been completely different. If you happen to know that one of these sites has not been treated fairly by my review, please let me know. There are always new things to learn about chess history on the Web.

Bye for now,
Mark Weeks